library(printr)
library(fpp2)
library(tidyverse)
library(ggthemes)
library(jtools)
theme_set(theme_clean())

Here are the optimal parameters R comes up with.
## Simple exponential smoothing
##
## Call:
## ses(y = pigs, h = 4)
##
## Smoothing parameters:
## alpha = 0.2971
##
## Initial states:
## l = 77260.0561
##
## sigma: 10308.58
##
## AIC AICc BIC
## 4462.955 4463.086 4472.665
Here are the forecasts for the next four months.
pigs.ses %>%
  kableExtra::kbl(caption = 'Pigs SES Forecasts') %>%
  kableExtra::kable_styling(full_width = F)

|  | Point Forecast | Lo 80 | Hi 80 | Lo 95 | Hi 95 |
|---|---|---|---|---|---|
| Sep 1995 | 98816.41 | 85605.43 | 112027.4 | 78611.97 | 119020.8 |
| Oct 1995 | 98816.41 | 85034.52 | 112598.3 | 77738.83 | 119894.0 |
| Nov 1995 | 98816.41 | 84486.34 | 113146.5 | 76900.46 | 120732.4 |
| Dec 1995 | 98816.41 | 83958.37 | 113674.4 | 76092.99 | 121539.8 |
The forecasts are also presented as a plot.
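The plot was presumably produced with autoplot(); a minimal sketch, re-fitting pigs.ses as in the call shown above:

```r
library(fpp2)

# Re-fit the SES model from the output above and plot the forecasts
pigs.ses <- ses(pigs, h = 4)
autoplot(pigs.ses) +
  autolayer(fitted(pigs.ses), series = "Fitted") +
  labs(y = "Number of pigs slaughtered")
```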
SES <- function(y, alpha, level){
  y_hat <- level
  for(index in 1:length(y)){
    y_hat <- alpha*y[index] + (1 - alpha)*y_hat
  }
  return(y_hat)
}
alpha <- pigs.ses$model$par[1]
level <- pigs.ses$model$par[2]
fc <- SES(pigs, alpha = alpha, level = level)
paste('The next forecast using my function is: ', fc)

## [1] "The next forecast using my function is: 98816.4061115907"
This forecast is identical to the ses() forecast of 98816.41.
# modify SES function to return SSE
SES <- function(params = c(alpha, level), y) {
  error <- 0
  SSE <- 0
  alpha <- params[1]
  level <- params[2]
  y_hat <- level
  for(index in 1:length(y)){
    error <- y[index] - y_hat
    SSE <- SSE + error^2
    y_hat <- alpha*y[index] + (1 - alpha)*y_hat
  }
  return(SSE)
}
opt_SES_pigs <- optim(par = c(0.5, pigs[1]), y = pigs, fn = SES)
paste('The parameters produced are : ', opt_SES_pigs$par[1], opt_SES_pigs$par[2])

## [1] "The parameters produced are : 0.299008094014243 76379.2653476235"
The alpha is essentially the same, although the initial level is different.
SES <- function(init_params, data){
  fc_next <- 0
  SSE <- function(params, data){
    error <- 0
    SSE <- 0
    alpha <- params[1]
    level <- params[2]
    y_hat <- level
    for(index in 1:length(data)){
      error <- data[index] - y_hat
      SSE <- SSE + error^2
      y_hat <- alpha*data[index] + (1 - alpha)*y_hat
    }
    fc_next <<- y_hat
    return(SSE)
  }
  optim_pars <- optim(par = init_params, data = data, fn = SSE)
  return(list(
    next_fc = fc_next,
    alpha = optim_pars$par[1],
    level = optim_pars$par[2]
  ))
}
mycalc <- SES(c(0.5, pigs[1]), pigs)
paste("Next observation forecast by ses function is: ", pigs.ses$mean[1], 'and my function returns ', mycalc$next_fc)

## [1] "Next observation forecast by ses function is: 98816.4061115907 and my function returns 98814.6336394835"

paste("The alpha calculated by ses function is: ", pigs.ses$model$par[1], 'and my function returns', mycalc$alpha)

## [1] "The alpha calculated by ses function is: 0.297148833372095 and my function returns 0.299008094014243"

paste("The level calculated from the ses function is ", pigs.ses$model$par[2], 'and my function returns', mycalc$level)

## [1] "The level calculated from the ses function is 77260.0561458528 and my function returns 76379.2653476235"
## [1] "The RMSE of the paperback books holt model is: 31.1369230162347"
## [1] "The RMSE of the hardcover books holt is: 27.1935779818511"
First we compare the paperback sales.
Then the hardcover sales.
The Holt model performs better in both instances.
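The RMSE values above can be reproduced from the model residuals; a sketch, assuming paper.holt and hard.holt were fit with holt() on the two columns of the books data:

```r
library(fpp2)

# Fit Holt's linear method to each series of the books data
paper.holt <- holt(books[, "Paperback"], h = 4)
hard.holt  <- holt(books[, "Hardcover"], h = 4)

# RMSE = square root of the mean squared residual
paste('The RMSE of the paperback books holt model is: ',
      sqrt(mean(paper.holt$residuals^2)))
paste('The RMSE of the hardcover books holt model is: ',
      sqrt(mean(hard.holt$residuals^2)))
```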
paper.holt.sd <- sd(paper.holt$residuals)
paper.holt.upper <- paper.holt$mean[1] + 1.96*paper.holt.sd
paper.holt.lower <- paper.holt$mean[1] - 1.96*paper.holt.sd
paste("The holt interval for the paperback is: ", paste(paper.holt.lower, paper.holt.upper, sep = ', '))## [1] "The holt interval for the paperback is: 147.839010441387, 271.094520600715"
paper.ses.sd <- sd(paper.ses$residuals)
paper.ses.upper <- paper.ses$mean[1] + 1.96*paper.ses.sd
paper.ses.lower <- paper.ses$mean[1] - 1.96*paper.ses.sd
paste("The ses interval for the paperback is: ", paste(paper.ses.lower, paper.ses.upper, sep = ', '))## [1] "The ses interval for the paperback is: 141.596371851814, 272.622958138623"
hard.holt.sd <- sd(hard.holt$residuals)
hard.holt.upper <- hard.holt$mean[1] + 1.96*hard.holt.sd
hard.holt.lower <- hard.holt$mean[1] - 1.96*hard.holt.sd
paste("The holt interval for the hardback is: ", paste(hard.holt.lower, hard.holt.upper, sep = ', '))## [1] "The holt interval for the hardback is: 195.963968137356, 304.383776287355"
hard.ses.sd <- sd(hard.ses$residuals)
hard.ses.upper <- hard.ses$mean[1] + 1.96*hard.ses.sd
hard.ses.lower <- hard.ses$mean[1] - 1.96*hard.ses.sd
paste("The ses interval for the hardback is: ", paste(hard.ses.lower, hard.ses.upper, sep = ', '))## [1] "The ses interval for the hardback is: 178.584828931457, 300.535355072066"
The holt() intervals are shifted upward relative to the ses() prediction intervals.
Let's take a look at the eggs dataset.
autoplot(eggs) +
  labs(title = 'Price of a dozen eggs in the United States',
       y = 'Price',
       x = 'Year')

I am going to use four years as a holdout set. First we look at the forecasts from the plain holt() model.
accuracy(holt(eggs.train, h = 100), x = eggs.test) %>%
  kableExtra::kbl(caption = 'Holt Performance') %>%
  kableExtra::kable_styling(full_width = F)

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | 0.2076337 | 27.130174 | 19.858538 | -1.042284 | 9.832292 | 0.9465410 | 0.0118883 |
| Test set | 3.7769587 | 5.422198 | 4.483322 | 4.713905 | 5.823012 | 0.2136939 | NA |
Next we use the damped holt() model.
accuracy(holt(eggs.train, h = 100, damped = T), x = eggs.test) %>%
  kableExtra::kbl(caption = 'Holt Dampened Performance') %>%
  kableExtra::kable_styling(full_width = F)

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | -3.186974 | 27.308027 | 20.179474 | -2.922321 | 10.22192 | 0.9618382 | -0.0044512 |
| Test set | -5.458580 | 9.295453 | 7.219194 | -8.748214 | 10.94594 | 0.3440970 | NA |
Lastly, let's try a holt() model with exponential = TRUE.
accuracy(holt(eggs.train, h = 100, damped = T, exponential = T), x = eggs.test) %>%
  kableExtra::kbl(caption = 'Holt Dampened and Exponential Performance') %>%
  kableExtra::kable_styling(full_width = F)

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | -2.224151 | 27.246634 | 20.218401 | -2.593662 | 10.21157 | 0.9636936 | 0.0121702 |
| Test set | -5.287971 | 9.184952 | 7.172699 | -8.507895 | 10.86064 | 0.3418809 | NA |
The regular holt() model has the lowest RMSE on both the training and test sets.
We can see that the seasonally adjusted graph is much smoother.
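The seasonal adjustment can be obtained from an STL decomposition; a sketch using the ukcars series analyzed next (the exact decomposition call is an assumption):

```r
library(fpp2)

# Remove the STL-estimated seasonal component from ukcars
ukcars.adj <- seasadj(stl(ukcars, s.window = "periodic"))
autoplot(ukcars, series = "Original") +
  autolayer(ukcars.adj, series = "Seasonally adjusted")
```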
## ETS(A,N,A)
##
## Call:
## ets(y = ukcars)
##
## Smoothing parameters:
## alpha = 0.6199
## gamma = 1e-04
##
## Initial states:
## l = 314.2568
## s = -1.7579 -44.9601 21.1956 25.5223
##
## sigma: 25.9302
##
## AIC AICc BIC
## 1277.752 1278.819 1296.844
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 1.313884 25.23244 20.17907 -0.1570979 6.629003 0.6576259
## ACF1
## Training set 0.02573334
The ets() function chose an ETS(A,N,A) model: additive errors and an additive seasonal component, with no trend.
accuracy(add.damp.stl) %>%
  kableExtra::kbl(caption = 'Damp Stl Performance') %>%
  kableExtra::kable_styling(full_width = F)

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | 1.47663 | 21.29656 | 16.60585 | 0.167431 | 5.330547 | 0.5383996 | 0.0060621 |
accuracy(holtfc) %>%
  kableExtra::kbl(caption = 'Holt Performance') %>%
  kableExtra::kable_styling(full_width = F)

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | -0.2811218 | 21.31828 | 16.5201 | -0.4191414 | 5.33054 | 0.5356195 | 0.0084781 |
accuracy(cars.ets) %>%
  kableExtra::kbl(caption = 'ETS Performance') %>%
  kableExtra::kable_styling(full_width = F)

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | 1.313884 | 25.23244 | 20.17907 | -0.1570979 | 6.629003 | 0.6576259 | 0.0257333 |
We can see that the best in-sample fit comes from the damped STL model.
I would think that the ETS has the most reasonable forecast.
We can see a monthly seasonal component coupled with what looks like an additive upward trend.
Let's fit the model on the training set using hw(), then plot the forecasts against the test set.
holt.visit <- hw(visit.train, h = 24,
                 seasonal = 'multiplicative')
autoplot(visit.test) +
  autolayer(holt.visit, PI = F, series = 'Holt-Winters Forecast') +
  labs(title = 'Test Set vs Holt Winters Multiplicative Forecast',
       y = 'Thousands of People',
       x = 'Year')

We can see that the forecasts drift further from the actuals as the horizon grows.
We need a multiplicative seasonal component in our model because the seasonal variation changes the farther out the series gets.
The ets() model forecasts are shown below.
visit.ets.fc <- forecast(ets(visit.train), h = 24)
autoplot(visit.test) +
  autolayer(holt.visit, PI = F, series = 'Holt-Winters Forecast') +
  autolayer(visit.ets.fc, PI = F, series = 'ETS Forecast') +
  labs(title = 'Test Set vs Holt and ETS',
       y = 'Thousands of People',
       x = 'Year')

We can see that the results from the ETS are very similar to the Holt-Winters, because ets() selects a Holt-Winters model as the best fit.
Let's use a Box-Cox transformation first.
visit.ets.bx.fc <- forecast(ets(visit.train, lambda = BoxCox.lambda(visit.train),
                                additive.only = T), h = 24)
autoplot(visit.test) +
  autolayer(holt.visit, PI = F, series = 'Holt-Winters Forecast') +
  autolayer(visit.ets.bx.fc, PI = F, series = 'Box-Cox ETS Forecast') +
  labs(title = 'Test Set vs Holt and Transformed ETS',
       y = 'Thousands of People',
       x = 'Year')

We have the same outcome.
Let's look at the seasonal naive model.
visit.naive <- snaive(visit.train, h = 24)
autoplot(visit.test) +
  autolayer(holt.visit, PI = F, series = 'Holt-Winters Forecast') +
  autolayer(visit.naive, PI = F, series = 'Naive Seasonal Forecast') +
  labs(title = 'Test Set vs Holt and Naive Seasonal ETS',
       y = 'Thousands of People',
       x = 'Year')

The seasonal naive forecast seems to do better than the Holt-Winters or our previous ETS models.
Let's take a look at an ETS with an STL decomposition.
visit.ets.bx.stl.ets <- visit.train %>%
  stlm(lambda = BoxCox.lambda(visit.train), s.window = 13,
       robust = TRUE, method = "ets") %>%
  forecast(h = 24)
autoplot(visit.test) +
  autolayer(holt.visit, PI = F, series = 'Holt-Winters Forecast') +
  autolayer(visit.ets.bx.stl.ets, PI = F, series = 'STL ETS Forecast') +
  labs(title = 'Test Set vs Holt and STL ETS Forecast',
       y = 'Thousands of People',
       x = 'Year')

It seems as though the STL ETS forecast has the best results.
cbind(c('Holt', 'ETS', 'ETS BC', 'ETS STL'),
      rbind(accuracy(holt.visit, x = visit.test)[2,],
            accuracy(visit.ets.fc, x = visit.test)[2,],
            accuracy(visit.ets.bx.fc, x = visit.test)[2,],
            accuracy(visit.ets.bx.stl.ets, x = visit.test)[2,])) %>%
  kableExtra::kbl(caption = 'Model Performance Comparison') %>%
  kableExtra::kable_styling(full_width = F)

|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 | Theil's U |
|---|---|---|---|---|---|---|---|---|
| Holt | 25.48 | 43.44 | 38.40 | 4.70 | 9.08 | 1.48 | 0.73 | 0.61 |
| ETS | 40.81 | 55.34 | 50.14 | 8.35 | 11.58 | 1.93 | 0.71 | 0.78 |
| ETS BC | 34.57 | 53.36 | 47.43 | 6.60 | 11.00 | 1.82 | 0.70 | 0.75 |
| ETS STL | 13.69 | 28.43 | 23.95 | 2.28 | 5.75 | 0.92 | 0.67 | 0.41 |
Going by MASE, the last model, the ETS STL, is the best fit, and it beats the naive benchmark (MASE < 1).
##
## Ljung-Box test
##
## data: Residuals from STL + ETS(M,A,N)
## Q* = 14.156, df = 20, p-value = 0.8225
##
## Model df: 4. Total lags used: 24
We can see good residual results for the STL ETS model.
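The Ljung-Box output above presumably comes from checkresiduals(); a sketch on a comparable STL + ETS fit (here on the full visitors series rather than the training subset used above):

```r
library(fpp2)

# Fit STL + ETS and run the residual diagnostics
fit <- stlm(visitors, robust = TRUE, method = "ets")
checkresiduals(forecast(fit, h = 24))
```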
fets_add_BoxCox <- function(y, h) {
forecast(ets(
y,
lambda = BoxCox.lambda(y),
additive.only = TRUE
),
h = h)
}
fstlm <- function(y, h) {
forecast(stlm(
y,
lambda = BoxCox.lambda(y),
s.window = frequency(y) + 1,
robust = TRUE,
method = "ets"
),
h = h)
}
fets <- function(y, h) {
forecast(ets(y),
h = h)
}
# Let's use RMSE for easier comparison
paste('Holt-Winters RMSE: ', sqrt(mean(tsCV(visitors, hw, h = 24,
                                            seasonal = "multiplicative")^2,
                                       na.rm = TRUE)))

## [1] "Holt-Winters RMSE: 40.1553355617822"
## [1] "ETS RMSE: 38.3396333495648"
## [1] "ETS BoxCox RMSE: 37.8521583384252"
## [1] "Seasonal Naive RMSE: 41.3571612878567"
## [1] "ETS STL RMSE: 38.1583447532845"
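The remaining RMSEs were presumably computed the same way, substituting each forecast function into tsCV(); e.g. the seasonal naive entry:

```r
library(fpp2)

# Cross-validated RMSE for the seasonal naive benchmark
paste('Seasonal Naive RMSE: ',
      sqrt(mean(tsCV(visitors, snaive, h = 24)^2, na.rm = TRUE)))
```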
We can see from the tsCV() results that the ETS Box-Cox RMSE is lower than the ETS STL RMSE, though only slightly.
Below we have the tsCV() results.
## [1] "Seasonal Naive RMSE: 0.133871084439129"
## [1] "ETS RMSE: 0.11184604637948"
We can see that the ETS performs better.
For all three ACF plots, the autocorrelations are insignificant. This does in fact mean that we are looking at white noise: the random numbers produced are not related to one another over time.
The distance of the critical-value bounds from zero depends on the sample size; the larger the sample, the narrower the bounds.
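A quick check of that claim: the dashed critical bounds on an ACF plot sit at roughly ±1.96/√T, so they move toward zero as the sample size T grows.

```r
# ACF critical bound as a function of sample size
acf_bound <- function(T) 1.96 / sqrt(T)

# Larger samples give tighter bounds
sapply(c(36, 360, 1000), acf_bound)
```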
For the closing price of IBM, every lag of the ACF is significant. On the PACF, however, only the first lag is, so we should use at least a first difference when forecasting the stock price; the series is not stationary and will drift over time.
For each of the series below, we can get the appropriate number of differences by using ndiffs().
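For example (usnetelec here is just an illustrative series, not necessarily one of those examined):

```r
library(fpp2)

# ndiffs() applies successive unit-root tests (KPSS by default)
# to estimate how many regular differences are needed
ndiffs(usnetelec)
```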
ar1 <- function(phi){
  y <- ts(numeric(100))
  e <- rnorm(100)
  for(i in 2:100){
    y[i] <- phi*y[i-1] + e[i]
  }
  return(y)
}

Below we plot the different generated graphs.
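The plotting code below also uses an ma1() generator and a series y that are not shown in this excerpt; a minimal sketch of the missing pieces, mirroring ar1() (the exact originals are assumptions):

```r
# MA(1) generator analogous to ar1() above
ma1 <- function(theta){
  y <- ts(numeric(100))
  e <- rnorm(100)
  for(i in 2:100){
    y[i] <- e[i] + theta*e[i-1]
  }
  return(y)
}

# y is presumably the 0.6 case layered into each plot, e.g.
# y <- ar1(0.6) before the AR plot and y <- ma1(0.6) before the MA plot
y <- ma1(0.6)
```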
autoplot(ar1(0.3), series = "0.3") +
  geom_line(size = 1, colour = "red") +
  autolayer(y, series = "0.6", size = 1) +
  autolayer(ar1(0.9), size = 1, series = "0.9") +
  ylab("AR(1) models") +
  guides(colour = guide_legend(title = "Phi"))

autoplot(ma1(0.3), series = "0.3") +
  geom_line(size = 1, colour = "red") +
  autolayer(y, series = "0.6", size = 1) +
  autolayer(ma1(0.9), size = 1, series = "0.9") +
  ylab("MA(1) models") +
  guides(colour = guide_legend(title = "Theta"))

It seems like one difference should make the data stationary.
## [1] 2
The ndiffs() function says that we need two differences.
We get stationary data after differencing twice.
I wouldn't include a constant in the model, since it would be integrated twice and generate a quadratic trend, which could swing the forecasts strongly in one direction.
\[ (1 - B)^2 y_t = (1 + \theta_1 B + \theta_2 B^2) e_t \]

### d)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,2,2)
## Q* = 11.764, df = 8, p-value = 0.1621
##
## Model df: 2. Total lags used: 10
The residuals don’t look too bad, but they could be more normal.
## Time Series:
## Start = 2005
## End = 2007
## Frequency = 1
## [1] 2.480525 2.374890 2.269256
## Series: wmurders
## ARIMA(0,2,2)
##
## Coefficients:
## ma1 ma2
## -1.0181 0.1470
## s.e. 0.1220 0.1156
##
## sigma^2 estimated as 0.04702: log likelihood=6.03
## AIC=-6.06 AICc=-5.57 BIC=-0.15
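The hand calculation below follows from rearranging the ARIMA(0,2,2) equation into a recursion, with future errors set to zero:

\[ y_t = 2y_{t-1} - y_{t-2} + e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2}, \qquad \theta_1 = -1.0181, \quad \theta_2 = 0.1470 \]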
To calculate by hand we plug back into our formula.
years <- length(wmurders)
e <- fc_wmurders_arima.0.2.2$residuals
fc1 <- 2*wmurders[years] - wmurders[years - 1] - 1.0181*e[years] + 0.1470*e[years - 1]
fc2 <- 2*fc1 - wmurders[years] + 0.1470*e[years]
fc3 <- 2*fc2 - fc1

Here are our forecasts:
## [1] 2.480523 2.374887 2.269252
They are similar to the forecasts from the actual function.
|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | -0.0113461 | 0.2088162 | 0.1525773 | -0.2403396 | 4.331729 | 0.9382785 | -0.0509407 |
|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | -0.0106596 | 0.2072523 | 0.1528734 | -0.2149476 | 4.335214 | 0.9400996 | 0.0217634 |
We can see that the ARIMA model we came up with does slightly better on the training set when comparing by MASE, but the difference is negligible.
## Series: austa
## ARIMA(0,1,1) with drift
##
## Coefficients:
## ma1 drift
## 0.3006 0.1735
## s.e. 0.1647 0.0390
##
## sigma^2 estimated as 0.03376: log likelihood=10.62
## AIC=-15.24 AICc=-14.46 BIC=-10.57
We can see that the ARIMA(0,1,1) with drift was selected by auto.arima().
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,1) with drift
## Q* = 2.297, df = 5, p-value = 0.8067
##
## Model df: 2. Total lags used: 7
The residuals look like white noise, so it's pretty good. Let's look at the forecast.
fc_austa_arima.0.1.1 <- forecast(
  Arima(austa, order = c(0, 1, 1)), h = 10
)
autoplot(fc_austa_arima.0.1.1)

fc_austa_arima.0.1.0 <- forecast(
  Arima(austa, order = c(0, 1, 0)), h = 10
)
autoplot(fc_austa_arima.0.1.0)

We can see that without the drift the forecasts are flat, while the prediction interval is narrower with the MA term. These forecasts are no better than the naive model.
fc_austa_arima.2.1.3.drift <- forecast(
  Arima(austa, order = c(2, 1, 3), include.drift = TRUE),
  h = 6
)
autoplot(fc_austa_arima.2.1.3.drift)

We have something of a dampened outcome with the drift.
fc_austa_arima.0.0.1.const <- forecast(
  Arima(
    austa, order = c(0, 0, 1), include.constant = TRUE
  ),
  h = 10
)
autoplot(fc_austa_arima.0.0.1.const)

fc_austa_arima.0.0.0.const <- forecast(
  Arima(austa, order = c(0, 0, 0), include.constant = TRUE),
  h = 10
)
autoplot(fc_austa_arima.0.0.0.const)

The new forecasts have a sharp decrease followed by a naive trend.
The Box-Cox transformation stabilizes the variance and makes the trend more linear, so it should be used in the forecasting.
lambda_usgdp <- BoxCox.lambda(usgdp)
usgdp_autoarima <- auto.arima(usgdp,
                              lambda = lambda_usgdp)
autoplot(usgdp, series = "Data") +
  autolayer(usgdp_autoarima$fitted, series = "Fitted")

## Series: usgdp
## ARIMA(2,1,0) with drift
## Box Cox transformation: lambda= 0.366352
##
## Coefficients:
## ar1 ar2 drift
## 0.2795 0.1208 0.1829
## s.e. 0.0647 0.0648 0.0202
##
## sigma^2 estimated as 0.03518: log likelihood=61.56
## AIC=-115.11 AICc=-114.94 BIC=-101.26
We can see a really good fit on the data with the ARIMA(2,1,0) with drift model.
## [1] 1
We need to difference once for stationarity.
We see a spike at lag 12, but our data is quarterly so this can be ignored.
## Series: usgdp
## ARIMA(1,1,0)
## Box Cox transformation: lambda= 0.366352
##
## Coefficients:
## ar1
## 0.6326
## s.e. 0.0504
##
## sigma^2 estimated as 0.04384: log likelihood=34.39
## AIC=-64.78 AICc=-64.73 BIC=-57.85
Let's also try an ARIMA with drift.
usgdp_arima.1.1.0.drift <- Arima(
  usgdp, lambda = lambda_usgdp, order = c(1, 1, 0),
  include.drift = TRUE
)
usgdp_arima.1.1.0.drift

## Series: usgdp
## ARIMA(1,1,0) with drift
## Box Cox transformation: lambda= 0.366352
##
## Coefficients:
## ar1 drift
## 0.3180 0.1831
## s.e. 0.0619 0.0179
##
## sigma^2 estimated as 0.03555: log likelihood=59.83
## AIC=-113.66 AICc=-113.56 BIC=-103.27
Everything seems like a really good fit.
|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | 1.195275 | 39.2224 | 29.29521 | -0.0136326 | 0.6863491 | 0.1655687 | -0.0382484 |
|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | 15.45449 | 45.49569 | 35.08393 | 0.3101283 | 0.7815664 | 0.198285 | -0.3381619 |
|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 |
|---|---|---|---|---|---|---|---|
| Training set | 1.315796 | 39.90012 | 29.5802 | -0.0167859 | 0.6834509 | 0.1671794 | -0.0854457 |
We can see that the auto.arima() fit the data best.
The forecasts seem reasonable and the prediction interval is tight around the predictions.
We have strong seasonality and an increasing trend. It seems like the seasonality is multiplicative.
From the ACF we can see that the autocorrelations weaken over time; furthermore, there is a large spike every 4 lags, which makes sense given that we have quarterly data.
Only about 5 lags are significant, after which the effect tapers off. We may need to difference out to 4 or 5 lags.
We can see that differencing helped, although one more lag might be useful.
We need an ARIMA model with seasonality, so something like an ARIMA(1, 1, 0)(0, 1, 1)[4].
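A sketch of fitting that candidate directly with Arima() (assuming the series is austourists; the seasonal period of 4 is picked up from the ts frequency):

```r
library(fpp2)

# Candidate seasonal model read off the ACF/PACF
fit <- Arima(austourists, order = c(1, 1, 0),
             seasonal = c(0, 1, 1))
summary(fit)
```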
## Series: austourists
## ARIMA(1,0,0)(1,1,0)[4] with drift
##
## Coefficients:
## ar1 sar1 drift
## 0.4705 -0.5305 0.5489
## s.e. 0.1154 0.1122 0.0864
##
## sigma^2 estimated as 5.15: log likelihood=-142.48
## AIC=292.97 AICc=293.65 BIC=301.6
The auto.arima() gives an ARIMA(1, 0, 0)(1, 1, 0)[4] model.
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,0,0)(1,1,0)[4] with drift
## Q* = 4.0937, df = 5, p-value = 0.536
##
## Model df: 3. Total lags used: 8
We have residuals that look like white noise.
usmelec_ma2x12 <- ma(usmelec, order = 12, centre = TRUE)
autoplot(usmelec, series = "Data") +
  autolayer(usmelec_ma2x12, series = "2X12-MA") +
  ylab(expression(paste("Electricity(x", 10^{9}, "KWh)"))) +
  ggtitle("Monthly total net generation of electricity") +
  scale_color_discrete(breaks = c("Data", "2X12-MA"))

We can see that the variation in the data increases over time, so a transformation is necessary.
## [1] -0.5738331
The data is not stationary.
## [1] 1
## [1] 1
We need one regular difference and one seasonal difference to make the data stationary.
Let's add in the differencing.
This leads to white noise residuals.
In this case I would try an ARIMA(0, 1, 2)(0, 1, 1)[12] with the Box-Cox transformation.
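A sketch of that candidate fit (the exact call is an assumption):

```r
library(fpp2)

# Candidate seasonal ARIMA with a Box-Cox transformation
lambda_usmelec <- BoxCox.lambda(usmelec)
fit <- Arima(usmelec, order = c(0, 1, 2),
             seasonal = c(0, 1, 1), lambda = lambda_usmelec)
checkresiduals(fit)
```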
We can see some seasonality, and a sudden increase in the price of copper later in the 2000s.
Let's see.
We can see that the Box-Cox transformation does make the data more stable, accounting for some of the sudden price increase, so this added stability should be useful for forecasting.
## Series: mcopper
## ARIMA(0,1,1)
## Box Cox transformation: lambda= 0.1919047
##
## Coefficients:
## ma1
## 0.3720
## s.e. 0.0388
##
## sigma^2 estimated as 0.04997: log likelihood=45.05
## AIC=-86.1 AICc=-86.08 BIC=-77.43
The auto.arima() gives an ARIMA(0,1,1) model with the lambda transform.
## [1] 1
## [1] 0
New models need to include one difference.
We can see a sinusoidal decrease in the autocorrelation values, so we can choose a (1, 1, 0) or, looking at the PACF, a (5, 1, 0).
mcopper_arima.1.1.0 <- Arima(mcopper, order = c(1, 1, 0), lambda = lambda_mcopper)
mcopper_arima.1.1.0

## Series: mcopper
## ARIMA(1,1,0)
## Box Cox transformation: lambda= 0.1919047
##
## Coefficients:
## ar1
## 0.3231
## s.e. 0.0399
##
## sigma^2 estimated as 0.05091: log likelihood=39.83
## AIC=-75.66 AICc=-75.64 BIC=-66.99
mcopper_arima.5.1.0 <- Arima(mcopper, order = c(5, 1, 0), lambda = lambda_mcopper)
mcopper_arima.5.1.0

## Series: mcopper
## ARIMA(5,1,0)
## Box Cox transformation: lambda= 0.1919047
##
## Coefficients:
## ar1 ar2 ar3 ar4 ar5
## 0.3706 -0.1475 0.0609 -0.0038 0.0134
## s.e. 0.0421 0.0450 0.0454 0.0450 0.0422
##
## sigma^2 estimated as 0.05028: log likelihood=45.32
## AIC=-78.63 AICc=-78.48 BIC=-52.63
The AIC is lower with the (5,1,0) model.
# AICc was -78.48.
# Let's try auto.arima() without the approximation and stepwise shortcuts.
mcopper_autoarima2 <- auto.arima(
  mcopper, lambda = lambda_mcopper,
  approximation = FALSE, stepwise = FALSE
)
mcopper_autoarima2

## Series: mcopper
## ARIMA(0,1,1)
## Box Cox transformation: lambda= 0.1919047
##
## Coefficients:
## ma1
## 0.3720
## s.e. 0.0388
##
## sigma^2 estimated as 0.04997: log likelihood=45.05
## AIC=-86.1 AICc=-86.08 BIC=-77.43
I am going to proceed with the auto.arima() Box-Cox transformed model. Let's check the residuals.
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,1)
## Q* = 22.913, df = 23, p-value = 0.4659
##
## Model df: 1. Total lags used: 24
We can see that the residuals are pretty good.
The forecast doesn't look good; we can try some of the other models.
The (1,1,0) model has the same outcome.
The (5,1,0) produces about the same thing.
Let's look at hsales.
This data looks pretty stationary, so I will not be transforming it.
We can confirm whether or not the data is stationary by using ndiffs() and nsdiffs().
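A sketch of those checks:

```r
library(fpp2)

# Regular and seasonal differences suggested by unit-root tests
ndiffs(hsales)
nsdiffs(hsales)
```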
## [1] 0
## [1] 1
As suspected, there is no need for a regular difference, although there is a seasonal pattern.
We can see that we should add the seasonal differencing. Let's see what auto.arima() gives us.
## Series: hsales
## ARIMA(1,0,0)(1,1,0)[12] with drift
##
## Coefficients:
## ar1 sar1 drift
## 0.8867 -0.4320 -0.0228
## s.e. 0.0294 0.0569 0.1642
##
## sigma^2 estimated as 27.92: log likelihood=-811.38
## AIC=1630.76 AICc=1630.92 BIC=1645.05
We can see that there is also a drift component and that auto.arima() decided on one seasonal difference.
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,0,0)(1,1,0)[12] with drift
## Q* = 39.66, df = 21, p-value = 0.008177
##
## Model df: 3. Total lags used: 24
We can see that the residuals could be better but they are not the worst.
The forecasts look reasonable, although they are very flat (we don't see any shift upwards or downwards).
The stlf() function does quite a similar job; I would honestly think it is better, because the seasonal pattern is very present in hsales.
The ARIMA model is (3,1,0).
We can see that this is appropriate because the significant spikes of the PACF go out to 3 lags.
Below are the forecasts for the next three years.
sheep.1940 = 1797 + 0.42*(1797 - 1791) - 0.20*(1791 - 1627) - 0.30*(1627 - 1665)
sheep.1941 = sheep.1940 + 0.42*(sheep.1940 - 1797) - 0.20*(1797 - 1791) - 0.30*(1791 - 1627)
sheep.1942 = sheep.1941 + 0.42*(sheep.1941 - sheep.1940) - 0.20*(sheep.1940 - 1797) - 0.30*(1797 - 1791)
c(sheep.1940, sheep.1941, sheep.1942)

## [1] 1778.120 1719.790 1697.268
## Time Series:
## Start = 1940
## End = 1942
## Frequency = 1
## [1] 1777.996 1718.869 1695.985
We have basically the same forecast values; the differences are likely due to rounding errors.

## Question 17

### a)
This is an ARIMA(4, 0, 0) or AR(4) model.
The ACF plot shows sinusoidally decreasing autocorrelation values, while the PACF plot shows significant spikes at lags 1 and 4 but none beyond lag 4, which is why the AR(4) model was chosen.
c = 162.00
phi1 = 0.83
phi2 = -0.34
phi3 = 0.55
phi4 = -0.38
bicoal.1969 <- c + phi1*545 + phi2*552 + phi3*534 + phi4*512
bicoal.1970 <- c + phi1*bicoal.1969 + phi2*545 + phi3*552 + phi4*534
bicoal.1971 <- c + phi1*bicoal.1970 + phi2*bicoal.1969 + phi3*545 + phi4*552
c(bicoal.1969, bicoal.1970, bicoal.1971)

## [1] 525.8100 513.8023 499.6705
y <- Quandl.datatable('ZILLOW/DATA', indicator_id='ZSFH', region_id='99999')
y <- y %>%
  arrange(date)
# We just want the values
y <- ts(y$value, start = c(2005, 01), frequency = 12)

We are going to use the ZHVI Single-Family Homes Time Series ($).
We can see an upward trend without seasonality. Let's look at the breakdown by ACF and PACF.
We can see that there is strong autocorrelation (which is expected) in addition to one large 1 lag spike for the PACF.
Let's see what ndiffs() suggests.
## [1] 2
Because ndiffs() suggests two differences, I would suggest an ARIMA(0, 2, 2)(0, 0, 1) model.
## Series: y
## ARIMA(0,2,2)(0,0,1)[12]
##
## Coefficients:
## ma1 ma2 sma1
## -1.3537 0.4674 -0.5290
## s.e. 0.0624 0.0646 0.1213
##
## sigma^2 estimated as 9677291: log likelihood=-1827.74
## AIC=3663.48 AICc=3663.69 BIC=3676.53
R agrees with the analysis; an ARIMA(0,2,2) is equivalent to linear exponential smoothing.
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,2,2)(0,0,1)[12]
## Q* = 6.0344, df = 21, p-value = 0.9994
##
## Model df: 3. Total lags used: 24
Besides the anomaly that is 2020, we have white noise.
It is interesting that the ARIMA forecasts a spike in the future - maybe this will absorb the trend decrease from the lead up to 2020?
Let's have R fit an ets() model.
## ETS(M,A,N)
##
## Call:
## ets(y = y)
##
## Smoothing parameters:
## alpha = 0.7511
## beta = 0.0848
##
## Initial states:
## l = 162098.7659
## b = 1774.4036
##
## sigma: 0.0103
##
## AIC AICc BIC
## 4089.984 4090.302 4106.349
We have an ETS(M,A,N) model: multiplicative errors and an additive trend. As stated before, there is no seasonality, so the seasonal component is omitted.
Let's check the residuals of the ets() model R chose.
##
## Ljung-Box test
##
## data: Residuals from ETS(M,A,N)
## Q* = 22.751, df = 20, p-value = 0.3012
##
## Model df: 4. Total lags used: 24
We can see that the ets() model handles the data pretty well; the residuals look like white noise.
Interestingly, the ets() model does not produce the small bump, and its prediction intervals widen much more into the future.
I would prefer the ARIMA model because of its narrower prediction intervals.